1. PURPOSE


The  purpose  of  this  project  is  to  conduct  an  engineering  study  of 
Speech  Bandwidth  Compression  Systems  using  the  method  of  a  frequency 
scanning  filter  for  sampling  the  speech  frequency-time  spectrum  and  using 
reverberation  techniques  for  interpolation  between  samplings. 


2.  ABSTRACT 

This report describes the analysis, experimentation, and fabrication
directed toward a time-frequency scanning speech compression system. A
statistical analysis was made to determine the scanning waveforms based on
articulation scores for selected zero crossings of infinitely clipped
speech. In addition, an empirical approach using sawtooth, triangular,
sine-wave and rectified sine-wave scans was employed to determine the
sampling scan and rates that would yield maximum articulation.
Experimentation shows that using an Autovox for reverberation improved
articulation.

This  project  initiated  the  concept  of  a  speech  compression  system  using 
two  filter  scans  to  intercept  formant  changes  and  yield  maximum  articulation. 
3. PUBLICATIONS, REPORTS, AND CONFERENCES


3.1  PUBLICATIONS  -  None. 

3.2  LECTURES  -  None. 

3.3  REPORTS  -  None. 

3.4  CONFERENCES 

3.4.1  The  first  project  conference  was  held  on  21  July  1960  at  PRD 
between  Mr.  Joseph  De  Clerk  of  your  laboratory  and  Mr.  Angelo  P.  Albanese 
of  this  laboratory.  The  purpose  of  this  meeting  was  to  discuss  policy  of  the 
contract  and  to  review  some  of  the  technical  aspects  of  the  project. 

3.4.2  A  project  conference  was  held  on  7  September  1960  at  our  plant 
between  Mr.  Martin  Weinstock  and  Mr.  Joseph  De  Clerk  of  your  laboratory 
and  Mr.  Angelo  P.  Albanese  of  this  laboratory.  This  meeting  outlined  some  of 
the  areas  to  be  investigated  on  this  project. 

3.4.3 A conference was held on 3 October 1960 to discuss Speech Bandwidth
Compression Systems. The personnel present were: Mr. Martin Weinstock
of your laboratory, Dr. Raisbeck* of Bell Telephone Laboratories, Mr. Kaiser
of IDA, Mr. Solee* of NSA, Dr. Carlos Angulo* of Brown University, and
Dr. M. J. DiToro and Mr. Angelo P. Albanese of PRD. The purpose of this
meeting was to determine the present status of work performed on Speech
Compression Contracts. The visiting group was concerned with the progress of
speech compression, the possibility of improvements, and the overall program
in the United States which is devoted to speech compression systems. The
formal discussion was mainly concerned with the technical proposal written by
Dr. M. J. DiToro.


*  On  loan  to  the  Institute  of  Defense  Analysis. 
3.4.4 A project conference was held on 9 December 1960 at our plant
between Mr. Joseph De Clerk of your laboratory and Mr. Angelo P. Albanese
of this laboratory. This meeting supplied Mr. De Clerk with information on
progress to date and an outline of future areas of investigation.

3.4.5 A project conference was held at our plant on 1 February 1961
between Mr. Joseph De Clerk and Mr. Frederick Evans of your laboratory,
and Mr. Angelo P. Albanese of this laboratory. The purpose of this meeting
was to discuss the project's future status without the consulting services of
Dr. M. J. DiToro. Mr. De Clerk was assured that the requirements as
stated in the proposal would be delivered by PRD.

3.4.6 Another project conference was held on 9 March 1961 at our
plant between Mr. Frederick Evans of your laboratory and Mr. Angelo P.
Albanese of this laboratory. This meeting supplied Mr. Evans with information
on the progress of the project and its future areas of investigation.

3.4.7 A project conference was held on 4 May 1961 at our plant between
Mr. Fred Evans of your laboratory and Messrs. Angelo P. Albanese,
Leon Zolotnitsky and Bernard Zivatofsky of this laboratory. Individual
conferences were held to display the equipment and report on project progress
to date. Future endeavors were outlined and discussed with anticipated system
results.

3.4.8 A project conference was held on 25 May 1961 at our plant between
Messrs. Martin Weinstock, Joseph De Clerk, and Fred Evans of your
laboratory and Dr. L. S. Castriota and Angelo P. Albanese of this laboratory.
This conference was a demonstration of the speech compression system and
a discussion on future modification of the present equipment. Another topic
of discussion was a presentation of a new concept for achieving speech
compression. Details were disclosed concerning a new and improved speech
compression system using two scanning filters to sample the speech
spectrum. At this time, PRD asked for an extension of three months to
complete work on the project at no extra cost to Fort Monmouth.

3.4.9  A  project  conference  was  held  at  our  plant  on  17  July  1961 
between  Mr.  Fred  Evans  of  your  laboratory  and  Messrs  Angelo  P.  Albanese 
and Leon Zolotnitsky of this laboratory. This meeting informed Mr. Evans
of the progress to date and described the extension program.

3.4.10  A  project  conference  was  held  at  our  plant  on  27  September  1961 
between  Mr.  Joseph  De  Clerk  of  your  laboratory  and  Dr.  L.  J.  Castriota  and 
Mr. Angelo P. Albanese of this laboratory. The final developments and
results of this project were discussed. Delivery of the project system with
technical procedures was arranged.
4. FACTUAL DATA


4.1  INTRODUCTION 

The  application  of  speech  bandwidth  compression  to  voice  communication 
promises  better  use  and  improved  performance  of  existing  communication  links. 
With  increased  traffic  demands  on  these  communication  links,  channel  capacity 
economy  is  desirable. 

Speech bandwidth or channel capacity reduction systems are normally
divided into four groups*: (1) Time or frequency compression: Sampling or
frequency division techniques with a bandwidth compression of four permit a
binary channel capacity of 5,000 to 10,000 bits per second. This project falls
within this category. (2) Continuous analysis-synthesis: Transmission of
analog control signals in place of speech signals reduces the required channel
capacity to about 2,000 bits per second. (3) Discrete sound analysis: Here
code groups substitute for the speech signal and identify fundamental sounds,
eliminating emotional and personal cues. System capabilities involve
information rates as low as 60 bits per second. (4) Sound group
analysis-synthesis: Transmission of certain words or phrases identified by
code groups has estimated useful information rates of 5 to 10 bits per second.
However, such rates are a function of the scope of vocabulary used.

4.2  THEORY  OF  FILTER  SCANNING 

The speech compression technique of this project is best shown by referring
to figure 1. Here an energy vs frequency vs time spectrograph of a voiced speech
sound is displayed. The pattern shown in the spectrograph is sampled by the
scanning filter of bandwidth Δf. Time per scan is designated as T and the range
of speech frequency scanned is 250 to 3250 cps.

* S. J. Campanella, A Survey of Speech Bandwidth Compression Techniques,
IRE Trans. Audio, Sept - Oct 1958, p. 105.
Figure 1b is a representation of filter scanning. Figure 1c shows the output
energy of this filter, indicating that it contains sufficient sampling to permit
an approximate reconstruction of the original spectrograph. Figure 1d shows the
restored spectrum where, with reverberation, interpolation between samples is
achieved.

A block diagram of a basic system performing these functions is indicated in
figure 2. At the transmitter the speech spectrum input of 250 to 3250 cps is
scanned, and the compressed speech signal shifted to a common center frequency
f, or 20 kc. This upward translation in frequency is effected with a balanced
modulator and a sawtooth scan of the local oscillator. The sawtooth scan and
the resultant spectrum shift are shown in figure 3a. The sawtooth scan
illustrated is merely representative since other waveforms may be generated
for scanning by the function generator.

The scanning rate 1/T of the function generator is controlled by a fixed
sinusoidal synchronizing signal of frequency f_s derivable from 1/T. As stated
previously, the compressed speech signal has been shifted to a common center
frequency f, or 20 kc. The waveform for this compressed signal is shown in
figure 3c, while figure 3b shows the original speech signal. The fixed
synchronizing frequency f_s is chosen to be at either edge of the link
bandwidth and so can be readily transmitted and consequently extracted.

At the receiver the compressed signal is subjected to an inverse process.
The sampled spectrum is approximately restored to the original speech spectrum
by means of reverberation. The reverberators used are commercial units such as
Fisher's Spacexpander and Kay's Autovox, and a feedback amplifier.

Restoration is possible if the sampling interval T of figure 1b is the
reciprocal of twice the bandwidth* Δf_c of the signal vs time representing the
formant center frequency. The compressed bandwidth ratio for this system is
f_o/(2f_c), where f_o is the original bandwidth of the speech signal and f_c is
the effective cutoff frequency of the formant frequency vs time signal. With
restoration by means of reverberation, this system improves articulation by
providing a continuity-in-sampling to the ear-brain chain.

*C. E. Shannon, Communication in the Presence of Noise, Proc. IRE, Jan. 1949, p. 10.
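
As a rough numerical illustration of the sampling condition stated above, suppose the formant center frequency, regarded as a signal in time, occupies a bandwidth B of 10 cps. Both the symbol B and the 10 cps figure are assumptions chosen only for illustration; they are not values measured on this project.

```latex
T = \frac{1}{2B} = \frac{1}{2 \times 10\ \text{cps}} = 0.05\ \text{sec} = 50\ \text{msec}
```

Under that assumption the filter would have to complete one scan of the 250 to 3250 cps range every 50 msec, that is, 20 scans per second.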

4.3  SYSTEM  ANALYSIS 

The major difference between the general and specific project systems
(figures 2 and 4) is that the latter system employs a single scanning oscillator
for purposes of system simplicity. This project envisioned no physical
separation of transmitter and receiver sections, so that the basis of
compression could be illustrated with the use of a single scanning oscillator.
Explanatory system details follow.

First, the Krohn-Hite filter limits the speech spectrum input from
250 to 3250 cps. This serves to transmit the first three formants which
contain information yielding articulation scores of better than 95%. A
reference bandwidth of 3 kc will serve to determine compression ratios.

Simultaneously, the function generator and scanning oscillator produce
a frequency-modulated signal. The function generator furnishes a variety of
scanning waveforms which modulate the local oscillator, the scanning oscillator
being an astable multivibrator whose base return voltage is changed by the
output of the function generator. The asymptotic voltage created by this
scanning waveform will generate a variety of frequencies. Although the
oscillator waveform at the collector is a square wave, the balanced modulator
and filter combination will only permit the fundamental frequency to mix with
the speech spectrum.
Referring now to the schematic in figure 5, the scanning oscillator is coupled
to the balanced modulator via a transistor and transformer combination. Besides
providing a proper amplitude level for modulation, the balanced modulator serves
as an impedance match. Since the balanced modulator is composed of matched
diodes in a bridge circuit, the diodes are switched on and off by the polarity
of the incoming signals, resulting in carrier suppression.

The narrow bandpass filter is an L-C network that samples the speech
spectrum in a manner dictated by the scanning waveform. Normally, amplitude
response is the criterion of filter performance, but here phase response or
phase deviation from linearity can contribute to frequency dispersion and
impair the signal. Consequently, the signal output of the filter and signal
output of the scanning oscillator would not be time synchronized. The delay
within the narrow bandpass filter would, therefore, produce a signal at the
receiver portion uncorrelated to the transmitter output. The Golay delay line,
an adjustable L-C network, compensates the average phase variation and assures
correct restoration of the speech spectrum at the output of the demodulator.
Spectrograph figure 6 illustrates the distortion without the use of the delay
line and its correction when the delay line is added.

At the output of the demodulator, the audio spectrum appears with
the addition of the higher frequency sidebands. These sidebands are attenuated
by the Krohn-Hite bandpass filter. Although the signal appearing at the output
of this filter is continuous in time, spectral time samples develop as a result
of the scanning rate and the narrow bandpass filters.

Referring now to figure 4, after the speech spectrum signal has been restored
by demodulation, the signal enters the reverberator. Here the Autovox reiterates
the sampled spectrum one or more times. The number of repetitions will be
determined by the compression ratio, waveform, etc. The amplifier paralleling
the Autovox extends the number of repetitions to three. Ballantine amplifiers
appearing throughout figure 4 amplify and match impedances at the locations
shown.


4.3.1  Scanning  Waveforms. 

Before we can choose the scanning waveform which will sample the speech
spectrum, we must know something about the information content of speech.
Little information other than the fundamental pitch can be extracted by
observing the amplitude versus time waveform of speech. However, by observing
the sound spectrograph we get a representative display of where the information
exists. If we observe a typical sound spectrograph, figure 1a, it can be seen
that the formants present are time variable. The fact that each formant changes
frequency constitutes modulation of sound contributing to intelligibility.
Variations of sound are attributed to the change in frequency of the high
intensity formants. Although we cannot predict where these changes occur, we
can attempt to choose a scanning waveform based on previous experimentation.

If there is absolutely no correlation of the scanning waveform to the changes of
the formant bars, then linear scanning by sawtooth waveform will serve to sample
information content with the same degree of effectiveness as any other waveform.
However, since this is uncertain, we can only attempt to use other waveforms and
compare the results experimentally.

Another aspect involves the discontinuity present in the sawtooth scan which
will produce high order transients resulting in low articulation scores*. To
overcome this and still maintain an equal time sample, the triangular scan can
be used to reduce these undesirable transients. However, since the lower formant
contains the highest contribution to intelligibility, it would seem advantageous
to use a waveform that would scan the lower formant for a longer period of time.
Such a typical waveform is the negative half-wave rectified sine wave. A
mathematical treatment for this waveform is contained in Appendix B of the First
Quarterly Report. Again, since no a priori information is available as to
information content of speech, the positive half-wave rectified sine-wave scan
is used for comparative purposes. Even these rectified scans present non-linear
sampling with some discontinuity. Therefore, the sinusoidal waveform with its
inherent smooth transitions will be used to emphasize maximum sampling time at
the lower and higher formants.

*D. L. Subrahmanyam and G. E. Petersen, Time-Frequency Scanning in Narrow-Band
Speech Transmission, IRE Trans. Audio, Nov - Dec 1959, p. 148.
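
The dwell-time argument above can be checked numerically. The following is a minimal sketch, not the project's function generator: each scan is normalized to the range 0 to 1 and mapped linearly onto 250 to 3250 cps, and the fraction of a scan period spent at or below a chosen frequency is counted. The function names, the 1000 cps limit, and the assumption that the flat part of the rectified sine sits at the low end of the band are ours.

```python
import math

F_LOW, F_HIGH = 250.0, 3250.0  # scanned speech band in cps (from the text)

def sawtooth(phase):
    """Linear rise from 0 to 1 over one scan, then an abrupt flyback."""
    return phase % 1.0

def triangular(phase):
    """Linear rise then linear fall; no flyback discontinuity."""
    p = phase % 1.0
    return 2.0 * p if p < 0.5 else 2.0 * (1.0 - p)

def sine(phase):
    """Sinusoid scaled to 0..1; slowest near both ends of the band."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * phase))

def rectified_sine(phase):
    """Half-wave rectified sine: flat at 0 for half of each scan.
    Assumption: the flat portion is mapped to the low end of the band."""
    return max(0.0, math.sin(2.0 * math.pi * phase))

def center_frequency(waveform, phase):
    """Filter center frequency (cps) for a normalized scan value."""
    return F_LOW + (F_HIGH - F_LOW) * waveform(phase)

def dwell_fraction(waveform, f_limit, steps=10000):
    """Fraction of one scan period spent at or below f_limit."""
    hits = sum(1 for k in range(steps)
               if center_frequency(waveform, k / steps) <= f_limit)
    return hits / steps

for name, wf in [("sawtooth", sawtooth), ("triangular", triangular),
                 ("sine", sine), ("rectified sine", rectified_sine)]:
    print(f"{name:15s} {dwell_fraction(wf, 1000.0):.3f}")
```

With these assumptions the two linear scans spend about one quarter of each period below 1000 cps, the sine about one third, and the rectified sine more than one half, which is the qualitative ordering the text relies on.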

Depending on the scan used, the reverberator must fill in the gap between
samples. In the case of the linear type of scan (e.g., sawtooth, triangular)
with reverberation, the entire spectrum will be continuous. However, in
non-linear scanning, the reverberator may fill in the gap, fall short of filling
it, or overlap the gap between samples, depending on the position within the
scan. No mathematical analysis of the reverberator will be made; instead, this
factor will be determined experimentally.

As described, the waveforms discussed contain no known correlation to
the information content of the speech spectrum. In order to develop some
correlation, the following statistical method of inquiry was used.

4.3.1.1 Statistical Analysis of Speech Spectrum.

Experimental investigation shows differentiated and infinitely clipped speech
has an articulation score of 95%**. From this we would infer that the
information lies in the relative positions of successive zero crossings of
differentiated and infinitely clipped speech signals. Application of the
spectral content of these zero crossings will be used to generate a scanning
waveform which will "on the average" sample the speech spectrum optimally. It is
therefore necessary to resolve the problem statistically. Since the zero
crossings are correlated to the maxima and minima of speech amplitude,
statistical results can be achieved simply by determining the relative frequency
of occurrence for specified periods. The ultimate issue can therefore be framed
in the question: "What is the probability density of having a zero crossing with
positive slope at time t_1 if it is known that there exists a zero crossing with
negative slope at t = 0?"

* Here discontinuity is defined as the point which has no derivative, as is
apparent at the cusp.

** J. Licklider and I. Pollack, Effect of Differentiation, Integration and
Infinite Clipping Upon Intelligibility of Speech, JASA, 1948, vol. 20, p. 42.

To  establish  this  probability  density  of  time  lapse  between  successive  events, 
i.e.,  time  of  successive  zero  crossings,  the  experimental  apparatus  and  schematics, 
figures  7,  8  and  9,  are  used.  Essential  circuits  include  the  clipper  amplifier  and 
distribution  analyzer. 

An explanation of this circuitry may be facilitated by referring to these
figures. With the exception of the clipper amplifier, all the circuitry is
transistorized. In the clipper amplifier, each stage is preceded by a pair of
silicon diodes which limit the voltage excursion at the tube input. The
circuitry of the distribution analyzer includes the delay multivibrator,
triggering circuits for the pulse generator, and an "and" gate. With this
circuitry, the differentiated speech signal is clipped, amplified, and converted
to a random series of square waves. Differentiation of the square waves produces
positive and negative spikes which are time markers for the location of positive
and negative slope zero crossings of the processed speech signal. The negative
pulse is delayed for a period of t_1 by the delay multivibrator, whose trailing
pulse output triggers pulse generator no. 1. A pulse of width Δt is produced
which activates the "and" gate.
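
The signal processing performed by the clipper amplifier and differentiator can be imitated on sampled data. The sketch below is only an illustration under our own assumptions: a first difference stands in for the analog differentiator, the sign function stands in for infinite clipping, and the 100 cps test tone and 8000 samples/sec rate are hypothetical.

```python
import math

def differentiate(samples):
    """First difference as a stand-in for the analog differentiator."""
    return [b - a for a, b in zip(samples, samples[1:])]

def infinitely_clip(samples):
    """Keep only the sign of each sample (+1 or -1): a square wave."""
    return [1 if s >= 0 else -1 for s in samples]

def crossing_markers(square):
    """Return (index, slope) pairs where the square wave changes sign.
    slope is +1 for a negative-to-positive transition, -1 otherwise."""
    return [(i, cur) for i, (prev, cur)
            in enumerate(zip(square, square[1:]), start=1) if cur != prev]

# Hypothetical test signal: 100 cps tone sampled at 8000 samples/sec.
rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / rate) for n in range(800)]
markers = crossing_markers(infinitely_clip(differentiate(tone)))
print(markers[:4])
```

For the test tone the markers alternate in slope and fall 40 samples (5 msec) apart, one half period of 100 cps, at the maxima and minima of the original waveform.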

The succeeding positive spike at the output of the clipper amplifier triggers
pulse generator no. 2 which produces a narrow pulse. If the time lapse between
these two spike waveforms is between t_1 and t_1 + Δt, the output from pulse
generator no. 2 will find the "and" gate open and will be registered by the
counter. If the positive spike follows the negative spike by less than time t_1,
this latter pulse will reset the multivibrator and prepare the system for the
next pair of pulses. In this way, no zeros are lost and a reasonably accurate
distribution can be obtained. The period t_1 is set by a variable capacitor in
the multivibrator, and for a particular setting establishing t_1, the pulse
count will be N_1.

Thus, if the total number of pairs of zeros in the speech sample is N_0, the
probability density for the distance between adjacent zeros lying between t_1
and t_1 + Δt is N_1/(N_0 Δt). By taking data for different values of t_1, the
entire density function may be obtained (figure 10).
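
The counting procedure just described amounts to a histogram estimate of the density. A minimal sketch follows; N1 (gated count) and N0 (total count) are names assumed here for the two counts, and the list of spacings is invented for illustration rather than taken from measured data.

```python
def interval_density(intervals, t1, dt):
    """Estimate the probability density of the spacing near t1 as
    N1 / (N0 * dt), where N1 counts spacings in [t1, t1 + dt) and
    N0 is the total number of spacings observed."""
    n0 = len(intervals)
    n1 = sum(1 for t in intervals if t1 <= t < t1 + dt)
    return n1 / (n0 * dt)

def density_curve(intervals, t_start, t_stop, dt):
    """Repeat the measurement for successive settings of t1."""
    curve = []
    k = 0
    while t_start + k * dt < t_stop:
        t1 = t_start + k * dt
        curve.append((t1, interval_density(intervals, t1, dt)))
        k += 1
    return curve

# Hypothetical zero-crossing spacings in milliseconds.
spacings_ms = [0.4, 0.6, 0.7, 0.9, 1.1, 1.2, 1.3, 1.8, 2.4, 3.1]
for t1, p in density_curve(spacings_ms, 0.0, 4.0, 1.0):
    print(f"{t1:.1f} ms  {p:.2f} per ms")
```

Summing density times interval width over all settings returns unity when every spacing falls inside the range covered, which is a convenient check on the apparatus count.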

The probability density curve for maxima-minima spacings is shown in
figure 10a. If, therefore, clipped differentiated speech passes through a pulse
width selector set for a pulse width between t_1 and t_2, only those pulses
whose width lies in this range will pass. (See figure 10b.) Then, the average
rate of pulses at the output will be the product of the shaded area of the curve
and the average pulse rate in the original signal. This involves dividing the
curve into a number of equal areas as indicated and obtaining articulation
scores for each section by means of the pulse width filter. The equipment can be
modified slightly to function as a pulse width selector since the counter need
only be replaced by a monostable multivibrator.

Pulse generator No. 1 is adjusted for the pulse width range t_1 to t_2. For
every pulse in the clipped speech with a pulse width between t_1 and t_2, the
multivibrator will be triggered. It will put out a pulse of width:

$$\bar{t} = \frac{\int_{t_1}^{t_2} t\,p(t)\,dt}{\int_{t_1}^{t_2} p(t)\,dt}$$

This is the average pulse width within the interval*.

This expression is the "expected value" of the pulse width for pulses whose
widths are within the specified range.
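
The two quantities associated with the pulse width selector, the output pulse rate and the average width of the pulses passed, can be estimated directly from a list of observed widths. This is a sketch under stated assumptions: the widths, the window limits, and the 1500 pulses/sec input rate are hypothetical, and the function name is ours.

```python
def selector_statistics(widths, t1, t2, input_rate):
    """Return (fraction passed, output pulse rate, mean passed width).
    widths      : observed pulse widths
    t1, t2      : limits of the selector window
    input_rate  : average pulse rate of the clipped speech"""
    passed = [w for w in widths if t1 <= w <= t2]
    fraction = len(passed) / len(widths)      # "shaded area" of the density
    output_rate = fraction * input_rate       # area times input pulse rate
    mean_width = sum(passed) / len(passed) if passed else None
    return fraction, output_rate, mean_width

# Hypothetical pulse widths (ms) and input pulse rate (pulses/sec).
widths_ms = [0.4, 0.6, 0.7, 0.9, 1.1, 1.2, 1.3, 1.8, 2.4, 3.1]
print(selector_statistics(widths_ms, 1.0, 2.0, input_rate=1500.0))
```

Here four of the ten widths fall in the window, so the output rate is four tenths of the input rate, and the mean of the passed widths is the sample counterpart of the expected value within the interval.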
The output waveform will be a pulse-code-modulated signal. This signal will
be used to measure the articulation of the areas shown. Its spectral content
will be arrived at by determining its autocorrelation function, from which the
spectral density can be found by using the Wiener-Khinchin theorem, i.e., the
autocorrelation function and spectral density are a Fourier transform pair.
Therefore, the signal energy can be related to the articulation score, yielding
on the average the required scanning waveform to optimize information
transmission.
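
The Wiener-Khinchin relation can be verified on a short record. The sketch below uses the finite, circular form of the theorem (the discrete Fourier transform of the circular autocorrelation equals the squared magnitude of the transform of the signal), which is our simplification of the continuous statement; the eight-sample two-level signal is hypothetical.

```python
import cmath

def dft(seq):
    """Direct discrete Fourier transform (O(N^2), fine for short records)."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * k * m / n)
                for m in range(n)) for k in range(n)]

def circular_autocorrelation(x):
    """r[m] = sum over n of x[n] * x[(n + m) mod N]."""
    n = len(x)
    return [sum(x[i] * x[(i + m) % n] for i in range(n)) for m in range(n)]

# Hypothetical two-level pulse train standing in for the selector output.
signal = [1, 1, -1, 1, -1, -1, 1, -1]

power_from_signal = [abs(c) ** 2 for c in dft(signal)]
power_from_autocorr = [c.real for c in dft(circular_autocorrelation(signal))]

for a, b in zip(power_from_signal, power_from_autocorr):
    print(f"{a:8.3f} {b:8.3f}")
```

The two printed columns agree to within rounding, so either route gives the same spectral density for the record.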

No  further  work  in  this  area  was  conducted  due  to  limited  funds  and  time.  It 
appeared  more  advantageous  to  continue  work  using  periodic  waveforms  to  scan  the 
speech  spectrum. 


*M. Schwartz, Information Transmission, Modulation and Noise, McGraw-Hill, N.Y.,
1959, p. 431.
4.3.2 System Components.


4.3.2.1 Scanning Oscillator.

The  scanning  oscillator  design  has  a  frequency  variation  of  3  kc  which 
is  the  bandwidth  of  the  speech  spectrum  to  be  compressed.  A  center  frequency 
of  20  kc  for  the  scanning  oscillator  was  selected  because  20  kc  filters  were 
readily  obtainable  and  this  frequency  lies  outside  the  audio  band. 

The scanning oscillator is a transistorized astable multivibrator and was
a preferred design because of greater reliability achieved through a reduced
number of components and because of superior stability. Consider the schematic
and tuning curve, figures 11 and 12. By observing the base return voltage
waveform we can see that the cutoff voltage permits alternate conduction of both
transistors. The time sequence of these transitions is dependent upon the base
resistor, the collector-to-base capacitor, and the base return voltage.
Frequency variation is achieved by allowing the base return to seek changes in
the voltage asymptote. Referring to the tuning curve, figure 12, note that the
frequency excursion is relatively large compared to the small driving voltage
used.
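
The dependence of frequency on base return voltage can be sketched with the usual idealized timing relation for a symmetrical astable multivibrator: each base is driven to about minus the supply voltage and recovers exponentially toward the base return voltage, so the half period is RC ln(1 + V_supply/V_return). This neglects base-emitter and saturation voltages, and the component values below are hypothetical, chosen only to land near 20 kc; they are not taken from figure 11.

```python
import math

def astable_frequency(r_base, c_couple, v_supply, v_return):
    """Idealized symmetrical astable multivibrator frequency in cps.
    Each half period is the time for a base, driven to about -v_supply,
    to recover to zero while charging toward v_return through r_base."""
    half_period = r_base * c_couple * math.log(1.0 + v_supply / v_return)
    return 1.0 / (2.0 * half_period)

# Hypothetical component values, chosen only to land near 20 kc.
R, C, VCC = 36e3, 1e-9, 12.0
for v in (10.0, 12.0, 14.0):
    print(f"base return {v:4.1f} V -> {astable_frequency(R, C, VCC, v):8.0f} cps")
```

In this idealization a higher base return voltage shortens the recovery time and raises the frequency, which is the mechanism by which the function generator output scans the oscillator.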

Originally a phase shift oscillator, figure 13, was designed for the scanning
oscillator. Frequency variations are attained by driving non-linear resistances,
figure 14, with the voltage waveform of the function generator. Since one tube
serves as the load in a cascode arrangement, the overall circuit will change
resistance as a function of grid voltage at the lower tube. As grid voltage
changes, the resulting current produces a change in plate resistance at the
upper tube, resulting in a non-linear operation. The capacitors and non-linear
resistance determine the frequency at which the loop gain is equal to 1 + j0.
Gain is provided by the 12AX7 amplifier in the closed loop to assure that this
value of Aβ is maintained for stable oscillation.
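
The loop gain condition Aβ = 1 + j0 can be illustrated with the textbook three-section phase shift network, in which each section is a series capacitor followed by a shunt resistor. This sketch assumes equal, fixed R and C and an unloaded output; it does not model the cascode non-linear resistance of figure 14, and the component values are hypothetical.

```python
import math

def feedback_ratio(freq, r, c, sections=3):
    """Open-circuit voltage ratio of cascaded series-C, shunt-R sections."""
    w = 2.0 * math.pi * freq
    z = 1.0 / (1j * w * c)      # series capacitor impedance
    y = 1.0 / r                 # shunt resistor admittance
    a, b, cc, d = 1.0 + z * y, z, y, 1.0   # ABCD matrix of one section
    ta, tb, tc, td = 1.0, 0.0, 0.0, 1.0    # running product (identity)
    for _ in range(sections):
        ta, tb, tc, td = (ta * a + tb * cc, ta * b + tb * d,
                          tc * a + td * cc, tc * b + td * d)
    return 1.0 / ta             # Vout / Vin with no load on the output

def oscillation_frequency(r, c):
    """Frequency at which the three-section ladder gives 180 degrees."""
    return 1.0 / (2.0 * math.pi * r * c * math.sqrt(6.0))

R, C = 10e3, 325e-12            # hypothetical values, about 20 kc
f0 = oscillation_frequency(R, C)
beta = feedback_ratio(f0, R, C)
print(f0, beta, -1.0 / beta.real)
```

At that frequency the network ratio β is real and equal to -1/29, so an inverting amplifier gain of 29 makes Aβ = 1 + j0. Varying R, as the non-linear resistances are intended to do, moves the frequency at which the condition is met.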


Undesirable performance became apparent when it was found, first, that
the tuning curve was non-linear, and second, that output amplitude varied as a
function of frequency. In an attempt to solve the first problem, three cascoded
stages were used. Some improvement was shown but this was insufficient to
satisfy our linearity requirements. In an attempt to solve the second problem,
we used a frequency selective network to control the amplitude over the
excursion of frequencies desired. Again, no satisfactory results were obtained
for our purposes.
4.3.2.2 Reverberator


Selection of a reverberation unit involves procurement or manufacture of a
suitable instrument capable of repeating signals a number of times corresponding
to the compression ratio. This in turn depends on the specifications of the
speech compression system. The delay unit should pass the speech frequency
spectrum of 250 to 3250 cps with a flat response to within ±3 db. A continuously
variable delay of 10 to 100 milliseconds adjustable in 1 millisecond increments
is a further requirement. The reverberator should have a characteristic
impedance in the order of 1000 ohms, and a transmission loss of 4 to 6 db would
be acceptable. However, in order to proceed expeditiously, some of these
requirements were waived to procure initial data. Our procedure for this and
other items was that if commercial units were available or could be easily
modified, then the unit was purchased. If cost, modifications, or delivery did
not make this a feasible procedure, then the unit was designed and built.
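
The function required of the reverberator, repeating each spectral sample a fixed number of times at a fixed delay, can be sketched on sampled data as a sum of delayed copies. This is an illustration only: the function name, the per-repetition gain, and the choice of three repetitions four samples apart are assumptions, not settings of the Autovox or Spacexpander.

```python
def reverberate(samples, delay, repeats, gain=1.0):
    """Add `repeats` delayed copies of the input to itself.
    delay   : spacing between copies, in samples
    repeats : number of extra copies (e.g. compression ratio minus one)
    gain    : attenuation applied per repetition (1.0 = none)"""
    out = [0.0] * (len(samples) + delay * repeats)
    for k in range(repeats + 1):
        scale = gain ** k
        for i, s in enumerate(samples):
            out[i + k * delay] += scale * s
    return out

# A single spectral sample (impulse) repeated three times, 4 samples apart,
# with a hypothetical loss of one half per repetition.
print(reverberate([1.0], delay=4, repeats=3, gain=0.5))
```

Setting the delay equal to the duration of one sample and the number of repetitions to suit the compression ratio fills the interval between successive scans, which is the interpolation the reverberator is asked to provide.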

4.3.2.2.1 Electrical Delay Line

Ultrasonic delay lines of the magnetostrictive (e.g., Ferranti Electric Co.) and
piezoelectric (e.g., Bliley Electric Co.) types were rejected because they
required a 100 kc carrier which would complicate the design of the speech
compression system.

Electrical networks were then considered and after a similar survey, the ESC
Corporation, Palisades Park, N. J., supplied a 3 millisecond delay line for
experimentation. Although ESC was ready to build to our specifications, the
price was high and delivery uncertain. The 3 millisecond line was tested and
data taken (figure 15).

4.3.2.2.2 Acoustical Delay Line

Acoustical delay lines were investigated and the Fisher Spacexpander
procured. The Spacexpander consists of an amplifier following a delay line
(electromagnetic transducers) and a potentiometer at the output which controls
the decay period after a 33 millisecond delay. The Spacexpander also contains a
switch which, turned to REVERB ONLY, permits only the reverberated signal to
pass, while in the MIX position both original and reverberated signals pass. The
Spacexpander is manufactured by Hammond Organs, Inc., of Chicago, Ill., who
refused to modify the unit. Since we were also unsuccessful in modifying the
Spacexpander ourselves, we decided to develop our own delay line. Because the
Spacexpander was comparatively inexpensive and readily available, we
nevertheless used it for our initial work.

4.3.2.2.3 Test Delay Line

Preliminary experiments were made with round and rectangular bars
of brass, copper and aluminum using the test fixture, figure 16a. The following
table was considered in choosing the metals and dimensions for test specimens.

TABLE 1

SOUND VELOCITY IN VARIOUS MATERIALS
(Handbook of Chemistry and Physics, 1957, p. 2319)

                     Sound         Sound       Length,              Length,
                     Velocity      Velocity    Longitudinal Waves   Shear Waves
Material             (meter/sec)   (ft/sec)    (meter/msec)         (meter/msec)

Aluminum             5104          16,740      5                    2.5
Silver               2610          8553        2.6                  1.3
Brass                3500          11,480      3.5                  1.75
Tin                  2500          8200        2.5                  1.25
Copper               3560          11,670      3.4                  1.7
Zinc                 3700          12,140      3.7                  1.85
Iron & Soft Steel    5000          16,410      5                    2.5
Ivory                3013          9886        3.0                  1.5
Nickel               4973          16,320      5                    2.5

Because initial data showed minute delays, further experiments were made with the test fixtures and setups of figures 16 and 17. Test data curves, figure 18, indicate delay and a frequency response similar to that of a comb filter.
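
The minute delays follow from Table 1: sound covers several meters of metal per millisecond, so a bench-sized bar delays a signal by only a fraction of a millisecond. The Python sketch below works through that arithmetic. The function names and the bar lengths are illustrative assumptions, not values from the tests, and the shear velocity is taken as half the longitudinal velocity, which is the approximation implied by the last column of the table.

```python
# Longitudinal sound velocities from Table 1, in meters per second.
VELOCITY_M_PER_SEC = {"aluminum": 5104.0, "brass": 3500.0, "copper": 3560.0}


def _velocity(material, shear):
    """Wave velocity; shear is assumed to be half the longitudinal value."""
    velocity = VELOCITY_M_PER_SEC[material]
    return velocity / 2.0 if shear else velocity


def delay_msec(material, length_m, shear=False):
    """Transit time in milliseconds along a bar of the given length."""
    return 1000.0 * length_m / _velocity(material, shear)


def length_for_delay_m(material, delay_ms, shear=False):
    """Bar length in meters needed for the requested delay."""
    return _velocity(material, shear) * delay_ms / 1000.0


# A hypothetical 0.35 m brass bar gives only about a tenth of a millisecond.
print(delay_msec("brass", 0.35))
# Matching the 33 msec delay of the Spacexpander would need over 100 m of brass.
print(length_for_delay_m("brass", 33.0))
```

Under these assumptions a straight bar of practical length cannot approach the tens of milliseconds wanted for reverberation, which is consistent with the move to helical and magnetic delay devices described in this section.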


Discrepancies in results (see figure 18) can be ascribed to temperature variations, inconstant stylus positions and pressures, etc. Attempts to obtain consistent results involved dismantling and reassembling the test fixture and comparing the results, figures 18c and 18d. Note the use of a 100 k carbon resistor to obtain a flattened curve.


4.3.2.2.4 Magnetic Delay Line

Magnetic  recorder  types  were  then  investigated  and  the  following  firms 
were  consulted  regarding  magnetic  tape  reverberators: 


American Geloso Electronics, Inc.      New York, N. Y.
Ampex Corp.                            Redwood City, Calif.
Amplifier Corp. of America             New York, N. Y.
Audio Master Corp.                     New York, N. Y.
Bogen-Presto Co.                       Paramus, N. J.
Brush Instruments, Inc.                Cleveland, Ohio
Dictaphone Corp.                       New York, N. Y.
Edwards Engineering Co.                Port Washington, N. Y.
Fairchild Recording Equipment Co.      New York, N. Y.
Federal Mfg. & Engineering Co.         Garden City, N. Y.
Kay Electric Co.                       Pinebrook, N. J.
Monroe Calculating Machine Co., Inc.   Orange, N. J.
Telectro Industries, Inc.              Long Island City, N. Y.
Whorf Engineering, Ltd.                Warwickshire, England


The Autovox, manufactured by Kay Electric Co., Pinebrook, N. J., was finally selected. In the Autovox, suitable delays are produced by adjusting the position of the heads at the periphery of magnetic disks. With auxiliary equipment, including a Missilyzer recorder for speech spectrographs, we were now able to proceed with our investigation.

4.3.2.3 Narrow Band Pass Filter


A  narrow  band  pass  filter  was  required  in  the  scanning  filter  system  to 
accomplish  bandwidth  compression  by  spectrum  sampling.  Since  no  variable 
filters  of  this  type  existed,  separate  fixed  filters  were  considered.  Specifica¬ 
tions  for  four  filters  are  listed  in  the  following  table: 

TABLE 2

SPECIFICATIONS, NARROW BAND PASS FILTERS

Characteristic                       Filter #1    Filter #2    Filter #3    Filter #4

Center frequency                     20,000 cps   20,000 cps   20,000 cps   20,000 cps
Bandwidth (3 db attenuation)         1500 cps     1000 cps     750 cps      500 cps
Attenuation                          60 db min.   60 db min.   60 db min.   60 db min.
Shape Factor                         2 to 3       2 to 3       2 to 3       2 to 3
  (60 db Bandwidth /
   6 db Bandwidth)
Input and Output Impedance           600 ohms     600 ohms     600 ohms     600 ohms
Phase Variation (Linearity           10%          10%          10%          10%
  with Bandwidth)
Approximate Compression Ratio        2:1          3:1          4:1          6:1
  (Speech Spectrum Bandwidth /
   Filter Bandwidth)
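
The approximate compression ratios in Table 2 are simply the speech spectrum bandwidth divided by the filter bandwidth. The short Python sketch below reproduces that arithmetic. It assumes a 3 kc speech spectrum, the figure used in the compression-ratio example of section 4.4.3; the function and constant names are illustrative only.

```python
# Assumption: a 3 kc (3000 cps) speech spectrum, as in section 4.4.3.
SPEECH_BANDWIDTH_CPS = 3000.0

# 3 db bandwidths of the four specified filters, in cps (Table 2).
FILTER_BANDWIDTH_CPS = {1: 1500.0, 2: 1000.0, 3: 750.0, 4: 500.0}


def nominal_compression_ratio(filter_bandwidth_cps,
                              speech_bandwidth_cps=SPEECH_BANDWIDTH_CPS):
    """Speech spectrum bandwidth divided by filter bandwidth."""
    if filter_bandwidth_cps <= 0.0:
        raise ValueError("filter bandwidth must be positive")
    return speech_bandwidth_cps / filter_bandwidth_cps


for number, bandwidth in sorted(FILTER_BANDWIDTH_CPS.items()):
    ratio = nominal_compression_ratio(bandwidth)
    print("Filter #%d: %.0f cps, about %g:1" % (number, bandwidth, ratio))
```

These are nominal ratios only; section 4.4.3 explains why the ratio is better judged from articulation scores than from bandwidths alone.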

L-C filters were investigated and Deeco Instruments of Van Nuys, California, was able to supply filters meeting the specifications for filters #1, #2 and #3. Filter #4 was not procurable, since it was extremely expensive and difficult to manufacture.

Deeco filters BP-492-1500, BP-492-1,000 and BP-492-750 were tested (figure 19) and used in the system. Although the Deeco filters were satisfactory, their imperfect phase linearity resulted in some dispersion. The effects of this dispersion constituted frequency distortion when restoring the speech spectrum.


Since no acceptable commercial filter had been available previously, a 1500 cps band-pass filter was designed and tested. Theoretical curves, filter schematics and the frequency response of this designed filter are shown in figures 20, 21 and 22.

Initially,  consideration  was  given  to  the  use  of  mechanical  filters  as  de¬ 
scribed  by  Doelz  and  Hathaway,  Electronics,  March  1953,  and  Hathaway  and 
Babcock,  IRE  Proc.,  Jan.  1957.  The  following  manufacturers  were  consulted 
regarding  mechanical  filters  meeting  our  specifications: 

Collins  Radio  Co.  Cedar  Rapids,  Iowa 

Burnell  &  Co.  Pelham  Manor,  N.  Y. 

Raytheon  Co.  Newton  58,  Mass. 

It was found that no mechanical filters which would satisfy or approach our requirements were available or could be ordered under reasonable conditions. Though the problem could probably be solved by using a multi-element crystal filter of the type studied by the Hermes Division of Itek Corporation, Cambridge, Mass., these were in the developmental stage and unavailable. The following L-C filter manufacturers were therefore consulted about supplying lumped constant filters to meet our requirements:


Freed Transformer Co.                  Brooklyn, N. Y.
Universal Toroid Coil Winding Co.      Irvington, N. Y.
G. B. Electronic Co.                   Valley Stream, N. Y.
V. T. C. Corporation                   New York, N. Y.
North Hills Electric Co.               Mineola, N. Y.
Deeco Instruments Co.                  Van Nuys, Calif.
Raytheon Co.                           North Hollywood, Calif.
Magnetic Systems Co.                   Monrovia, Calif.

4.3.2.4 Adjustable Band Pass Filter


Another  and  different  filter  aspect  involved  the  elimination  of  speech 
information  being  transmitted  through  the  scanning  oscillator,  to  the  receiver 
portion  of  the  system.  High-pass  filtering  was  required  and  the  Krohn-Hite 
Variable  Band- Pass  Filter  Model  310-AB  was  obtained,  although  this  was  not 
its  original  application  (First  Quarterly  Report,  p.  7). 

4.3.2.5 Function Generator

The  function  generator  equipment  used  to  supply  a  variety  of  scanning 
waveforms  included  the  following:  Hewlett-Packard  Test  Oscillator  Model  202A 
supplied  the  necessary  sine-wave  and  triangular  scans.  The  sawtooth  scan  was 
obtained from a Tektronix oscilloscope.

4.4 ARTICULATION MEASUREMENTS


4.4.1  Introduction 

The measurements here obtained were articulation scores for bandwidth-sampled, time-sampled, and time-frequency sampled speech. The measurements obtained and the conditions of the articulation testing of the auditors employed are outlined and discussed in this section. Since the essential element of this project was to test the effect of the speech compression system on the articulation scores of auditors, these will be treated en bloc here. Data and other measurements on the electronics of the system employed will be found in other portions of this report.

4.4.2  Articulation  Measurement  Conditions 

Articulation  scores  may  be  obtained  by  a  variety  of  techniques  using  phonet¬ 
ically  balanced  word  lists.  *  In  this  project,  a  single  articulate  speaker  was  first 
selected  for  consistency.  Several  speaker  candidates  recorded  the  phonetically 
balanced  word  list  and  the  articulation  scores  of  five  listeners  were  examined. 
Since  the  chosen  speaker  JM  rated  highest,  JM  then  recorded  the  phonetically  bal¬ 
anced  word  lists  PB  and  PB  .  Obtained  from  the  Harvard  Psycho-Acoustic  Lab- 
oratory,  each  list  contains  a  thousand  single  syllable  words  arranged  in  random 
order  so  that  no  associations  can  be  made  by  listeners.  Listeners  MM,  BS  and 
MF  were  similarly  chosen  from  other  candidates,  because  of  their  high  articu¬ 
lation  scores.  Each  listener  could  adjust  the  volume  for  personal  preference. 

To  avoid  possible  association,  random  groupings  of  words  within  these  lists  were 
prepared,  and  as  a  further  check  the  tests  were  repeated  for  purposes  of  comparison. 

*James P. Egan, Articulation Testing Methods. The Laryngoscope, vol. 58, pp. 955-991, September 1948.

The phonetically balanced word lists were recorded on magnetic tape, one hundred words constituting a run. Bandwidth limiting was used to establish a reference for subsequent testing. The speech signal was then time-sampled with a square-wave configuration adjustable from 12 to 100 milliseconds, with and without reverberation. Similarly, time-frequency sampled testing with sawtooth, sine-wave, etc. configurations was performed and scores were recorded for all runs.
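
Square-wave time sampling amounts to gating the speech on and off once per scanning period. A minimal Python sketch of such a gate follows. It is an illustration, not the chopper circuit of figure 27: the function name, the sampling rate in the usage line, and the default half-on, half-off split (suggested by the 2:1 ratio of scanning period to time sample noted with the time-sampling figures) are assumptions.

```python
def chop(samples, sample_rate_hz, period_ms, on_fraction=0.5):
    """Gate a sampled signal with a square wave.

    Samples are passed during the first `on_fraction` of every period
    and replaced by zero for the rest of it.
    """
    period_n = int(round(sample_rate_hz * period_ms / 1000.0))
    if period_n < 1:
        raise ValueError("period is shorter than one sample")
    on_n = int(round(period_n * on_fraction))
    return [s if (i % period_n) < on_n else 0.0
            for i, s in enumerate(samples)]


# Eight samples at 1000 samples/sec with a 4 msec period:
# two samples pass, two are blanked, and the pattern repeats.
print(chop([1.0] * 8, 1000, 4))
```

The reverberator under test would then be asked to fill the blanked intervals with decaying repetitions of the preceding sample.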

4.4.3  Articulation  Measurements  With  Band  Limited  Speech 

After establishing the conditions for determining articulation scores, we
proceeded to measure the articulation scores as speech bandwidth is varied. The
test  set-up  used  is  shown  in  the  block  diagram  of  figure  23.  Also  shown  in  this 
figure  is  a  frequency  response  curve  showing  two  bandwidths  located  at  3  db  and 
20  db  points.  To  ensure  that  the  information  content  of  speech  lies  within  the 
passband  of  the  filter,  wideband  noise  was  added  to  the  speech  signal  at  a  signal- 
to-noise  ratio  of  20  db. 

The  curve  of  figure  24  shows  the  relationship  of  articulation  scores  with 
bandwidth.  The  articulation  scores  point  up  the  significance  of  the  3  db 
bandwidth.  More  significant  is  the  fact  that  these  scores  will  serve  as  a  refer¬ 
ence  for  future  tests  taken  with  the  speech  bandwidth  compression  system.  The 
degree  of  compression  will  not  be  determined  by  the  narrowband  filter,  but  will 
be  dependent  upon  and  related  to  the  articulation  scores.  For  example,  if  the  com¬ 
pression  system  yields  an  articulation  score  of  73  and  its  bandwidth  is  1  kc,  the 
compression  ratio  is  determined  by  obtaining  the  effective  bandwidth  from  figure 
24 (1.25 kc) and dividing by the bandwidth of the narrow-bandpass filter. Therefore, the true compression ratio is

    1.25 kc / 1 kc = 1.25

Had the compression ratio been determined by taking bandwidth ratios, the ratio would have been

    3 kc / 1 kc = 3


However,  it  is  obvious  that  the  latter  ratio  is  not  significant  since  there  is  no 
comparison  of  essential  equivalents. 
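
The procedure described above (read the effective bandwidth off the articulation curve, then divide by the channel bandwidth) can be sketched in Python as follows. The reference curve here is hypothetical: only the point pairing a score of 73 with 1.25 kc comes from the text, and the other two points are placeholders standing in for the measured curve of figure 24. The function names are likewise illustrative.

```python
# HYPOTHETICAL stand-in for figure 24: (bandwidth in kc, articulation score).
# Only the (1.25 kc, 73) point is taken from the text.
REFERENCE_CURVE = [(0.5, 40.0), (1.25, 73.0), (3.0, 95.0)]


def effective_bandwidth_kc(score, curve=REFERENCE_CURVE):
    """Bandwidth of band-limited speech giving the same articulation score.

    The curve is inverted by linear interpolation between neighboring points.
    """
    for (bw_lo, s_lo), (bw_hi, s_hi) in zip(curve, curve[1:]):
        if s_lo <= score <= s_hi:
            return bw_lo + (bw_hi - bw_lo) * (score - s_lo) / (s_hi - s_lo)
    raise ValueError("score lies outside the reference curve")


def true_compression_ratio(score, channel_bandwidth_kc, curve=REFERENCE_CURVE):
    """Effective bandwidth divided by the bandwidth actually transmitted."""
    return effective_bandwidth_kc(score, curve) / channel_bandwidth_kc


# The example from the text: a score of 73 through a 1 kc channel.
print(true_compression_ratio(73.0, 1.0))
```

Comparing equal articulation scores in this way keeps both sides of the ratio at the same intelligibility, which a plain bandwidth ratio does not.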


4.4.4  Articulation  Measurements  With  Time-Sampled  Speech 

Articulation scores for time-sampled speech, together with the test setup, are indicated in figures 25 and 26. The time sampling circuit or chopper schematic is shown in figure 27.

In both figures articulation scores were obtained for time-sampled speech with and without reverberation. In figure 25, using the Fisher Spacexpander, articulation measurements were made at different settings: 0, 50, and 95. These settings control the degree of signal reflection so as to result in repetition of the original input with decreasing amplitude. The net effect is to lengthen the time the signal exists. Considering this effect in the frequency domain, the response of the Spacexpander will be equivalent to the response of a comb filter.
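
The comb-filter behavior can be illustrated with an idealized model: a single delay whose output is fed back with a gain below one, so that each repetition is weaker than the last. The Python sketch below evaluates the magnitude response of that model. It is an assumption-laden simplification rather than a description of the actual unit; the feedback value of 0.5 is hypothetical, and only the 33 msec delay is taken from the text.

```python
import cmath
import math


def recirculating_delay_gain(freq_cps, delay_sec, feedback):
    """Magnitude response of y(t) = x(t) + feedback * y(t - delay_sec)."""
    if not 0.0 <= feedback < 1.0:
        raise ValueError("feedback must be in [0, 1) for decaying repetitions")
    phase = -2.0 * math.pi * freq_cps * delay_sec
    return abs(1.0 / (1.0 - feedback * cmath.exp(1j * phase)))


DELAY_SEC = 0.033   # 33 msec delay quoted for the Spacexpander
FEEDBACK = 0.5      # hypothetical setting, for illustration only

# Peaks fall at multiples of 1/delay (about 30 cps apart); troughs lie midway.
peak = recirculating_delay_gain(1.0 / DELAY_SEC, DELAY_SEC, FEEDBACK)
trough = recirculating_delay_gain(0.5 / DELAY_SEC, DELAY_SEC, FEEDBACK)
print(peak, trough)
```

In this model the teeth of the comb are spaced at the reciprocal of the delay, and a larger feedback setting deepens them while lengthening the decay.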

The results in figure 25 show that the average articulation score was lowered by 8 to 10% for both Spacexpander reverberation settings. It also appears that a 50 msec scanning period gave the highest articulation score; this is the optimum period described in the original proposal.

Figure 26 shows that the articulation score was improved for all scanning periods when the Autovox was used as the reverberator. Since these data were encouraging, we decided to use the Autovox in our system.

Other  reverberator  devices  used  included  a  3  msec  delay  line  (ESC) 
and  a  3  msec  delay  line  (ESC)  with  a  feedback  amplifier.  Figures  28  and  29 
illustrate  the  test  setups  and  data. 

Initially, these data had been taken to observe the effects of reverberation at high interruption rates. The data indicated that using the 3 msec delay line as the reverberator, with a sampling time of 3 msec and a scanning period of 6 msec, produced no improvement in articulation. In fact, using this delay line as a reverberator decreased the articulation to about 10%.

The 3 msec delay line was then used in combination with a feedback amplifier. The feedback amplifier schematic shown in figure 30 uses a 90 k potentiometer to vary the loop gain, Aβ. Note that the amplifier output is terminated in approximately 600 ohms to match the characteristic impedance of the delay line, and that the frequency response of the amplifier shown in figure 31 covers minimum and maximum gain settings. As the block diagram of the combined delay line and amplifier shows, the initial signal is repeated with a decreasing amplitude controlled by the loop gain. Here the repeated signal will overlap the following signal sample, but by adjusting Aβ the amplitude can be made low enough not to distort the succeeding signal sample; e.g., using an Aβ of 0.5 improved articulation. Sound spectrographs for different Aβ settings, figure 32, show pictorially the degree of reverberation overlap.
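
The trade-off between filling the gap and overlapping the next sample can be put in numbers. In an idealized feedback loop each pass through the delay line scales the signal by the loop gain, so the n-th repetition has relative amplitude equal to the loop gain raised to the n-th power. The Python sketch below is such an idealization, with illustrative function names; it takes the 3 msec delay and the loop gain of 0.5 from the text and the 3 msec sample in a 15 msec period from the caption of figure 29.

```python
def echo_amplitudes(loop_gain, count):
    """Relative amplitudes of the first `count` repetitions: loop_gain ** n."""
    if not 0.0 <= loop_gain < 1.0:
        raise ValueError("loop gain must be in [0, 1) for a decaying signal")
    return [loop_gain ** n for n in range(1, count + 1)]


def first_overlapping_echo(delay_ms, sample_ms, period_ms):
    """Index of the first repetition that runs into the next signal sample.

    Repetition n occupies the interval from n * delay_ms to
    n * delay_ms + sample_ms; the next sample begins at period_ms.
    """
    n = 1
    while n * delay_ms + sample_ms <= period_ms:
        n += 1
    return n


n = first_overlapping_echo(3.0, 3.0, 15.0)
print(n, echo_amplitudes(0.5, n)[-1])
```

Under these assumptions four repetitions fit in the gap and the fifth, at about 3% of the original amplitude, is the first to overlap the next sample, which is in line with the observation that a moderate loop gain need not distort the succeeding sample.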
4.4.5 Articulation Measurements with Time-Frequency Sampled Speech

Using the project system, the time-frequency sampling data for compression ratios of two and four are shown in figures 33 and 34. Data were first taken at the compression ratio of two, because if results were favorable at this level we could then proceed to take data for the higher compression ratios. Although results at the compression ratio of four were satisfactory, further investigation of still higher compression ratios was not possible because of time and budget limitations.

Data for compression ratio two are shown in figure 33. We may conclude from these data that the type of scanning waveform does not influence articulation scores appreciably, whether with or without reverberation. In some instances reverberation will improve articulation for a specific waveform and scanning rate. Note that in general the articulation scores with and without reverberation are about 90%, which constitutes fairly good intelligibility for systems of this type. The only disturbing factor encountered was the presence of noise when using the sawtooth scanning waveform. As mentioned previously, this is due to the generation of high order transients inherent in the discontinuities of this waveform.

Articulation scores for the compression ratio of four, shown in figure 34, were about 10% lower than scores for the compression ratio of two. One pertinent fact, however, is that reverberation on the average improved articulation scores.

5. OVERALL CONCLUSIONS

Results of the speech bandwidth compression tests show that fairly good articulation scores were obtained. Although only limited compression ratios were used, higher compression ratios would have yielded satisfactory articulation scores, although not as high as those obtained at the ratios used in this project. We realize that the components used had limitations; mainly, the narrow bandpass filter introduced distortion which did not allow correct restoration of the speech spectrum. This was reflected in the lower articulation scores obtained. Had greater effort been directed toward improving and refining the time synchronization of the system, higher articulation scores would have resulted. This is verified by the experimental data taken with and without the Golay delay line. In fact, the articulation scores continually improved with the continual adjustment of time delay for correct synchronization.

Using reverberators for interpolating between samples resulted in articulation scores highly dependent upon the reverberator used. For example, the Autovox proved to be more effective than all others. This may be attributed to the fact that control was readily available through the number of repetitions, that the repetitions remained fairly constant in intensity, and that overlap could be minimized.

The statistical method investigated was to provide a means of determining a periodic scanning waveform. Initially our intention was to make measurements on the distribution of zero crossings of infinitely clipped speech. Results shown in figure 10a constitute the extent of our progress. Here, the distributions show the relative frequency of occurrence of particular periods of speech signals. No further extension of this work was possible due to limitations of time and remaining funds on this project.

The system appears to have promise for small compression ratios. Its simplicity makes for ruggedness and reliability of performance.

6.  RECOMMENDATIONS 

Although  this  system  provided  data  for  compression  ratios  of  4:1,  the 
potential  available  with  extended  parameters  could  have  furnished  compression 
ratios  as  high  as  8:1.  Modifications  required  to  achieve  such  a  compression 
ratio  would  demand  further  investigation  of  phase  dispersion  and/or  time  syn¬ 
chronization.  Considering  available  commercial  equipment,  e.g.,  an  additional 
Autovox  and  a  narrow  bandpass  filter  with  linear  phase  characteristics,  the 
present  system  could  have  been  extended  to  the  compression  ratios  mentioned 
previously.  However,  because  of  the  time  and  fiscal  limitations  of  the  project, 
this  was  not  feasible. 

As  a  result  of  the  work  on  the  present  system,  a  system  concept  employ¬ 
ing  modified  filter  scanning  techniques  has  been  developed.  This  concept  vis¬ 
ualizes  a  sampling  of  the  speech  spectrum  by  locating  two  filters  to  intercept 
changes  of  formant  frequencies  and  transmit  these  changes  over  a  narrow  band 
communication  link. 

A technical proposal of this system is appended to this report as Appendix A.

(A) SOUND SPECTROGRAM OF

Figure 1. Spectrograms of Time-Frequency Sampling (Sawtooth) with Reverberation

TRANSMITTER

RECEIVER

NOTE: IN AN ACTUAL SYSTEM THE TRANSMITTER SYNCHRONIZATION SIGNAL WOULD BE TRANSLATED TO THE PASSBAND OF THE TRANSMISSION LINK AND DETECTED AT THE RECEIVER BY A FILTER.

Figure 2. Block Diagram of Generalized Speech Compression System

Figure 4. Block Diagram of Project System for Time-Frequency Sampled Speech


LEGEND: 

A.  SOUND  SPECTROGRAPH, DEECO  FILTER  750  CPS  BANDWIDTH 

B.  SOUND  SPECTROGRAPH,  DEECO  FILTER  WITH  1.5  MSEC  DELAY 


Figure 6. Sound Spectrographs for (a) Uncompensated, and (b) Compensated Golay Delay

Figure 7. Block Diagram Test Setup for Probability Distribution Analysis

Figure 8. Schematic of Clipper Amplifier

Figure 9. Schematic of Distribution Analyzer

A. Probability Distribution of Zero Crossings for Infinitely Clipped Speech

B. Equal Distribution Areas for Time Periods

Figure 10. Probability Distribution Curves for Speech Zero Crossings

Figure 12. Frequency Variation of Scanning Oscillator vs Applied Voltage

Figure 13. Schematic of Phase Shift Oscillator

Figure 14. Variable Resistance Network in Phase Shift Oscillator

Figure 15. Frequency Response of 3 msec Delay Line (ESC Corp.)

B. Test Fixture of Delay Line

Figure 16. Test Delay Line Apparatus

(A) PHASE SHIFT TEST SETUP

FOR SIGNAL CHOPPER SCHEMATIC SEE FIGURE 27.

(B) SIGNAL SHIFT TEST SETUP

Figure 17. Block Diagram of Test Setups for Investigation of Test Delay Lines

(A), (B) COPPER WIRE HELIX, 18 TURNS, 0.052 IN. DIA, 45 IN. LG, SIGNAL INPUT 0.85V

(C) SAME HELIX, 2.2V SIGNAL, 100K CARBON RESISTOR IN PARALLEL WITH OUTPUT CARTRIDGE, SHIELDING & GROUNDING IMPROVED.

(D) SAME TEST FIXTURE (C) AFTER DISMANTLING & REASSEMBLY.

Figure 18. Test Delay Line Data Curves

Figure 18. (continued) Test Delay Line Data Curves

Figure 19. Deeco Filter Data Curves

Figure 20. Frequency Response of Designed Narrow Band Pass Filter (1500 cps)

B. Filter schematic with calculated values

Figure 21. Designed Narrow Band Pass Filter Schematics (1500 cps)

Figure 22. Calculated Response Curves of Designed Narrow Band Pass Filter
Figure 23. Block Diagram for Articulation Measurement of Band-Limited Speech

Figure 24. Articulation vs Bandwidth Curve

Figure 25. Articulation Measurements for Time-Sampled Speech Using Spacexpander Reverberator

[Plot panels: articulation score vs. scanning period (msec), 10 to 100 msec, one panel with no reverberation and one with added reverberation. Note: scanning period to time sample, 2:1. Block diagram, test setup: speaker, speech chopper, Autovox, audio amplifier.]

Figure 26. Articulation Measurements for Time-Sampled Speech Using Autovox Reverberator
Figure 27. Schematic of Time Sampler or Speech Chopper

Data on Time-Sampled Speech (3/6 msec) Using Delay Line

Figure 28. Test Setup and Data on Time-Sampled Speech (3/6 msec) Using Delay Line Reverberator

Data on Time-Sampled Speech (3/15 msec) Using Open Loop Gain

Figure 29. Test Setup and Data on Time-Sampled Speech (3/15 msec) Using Feedback Reverberator

Figure 32. Speech Spectrograph "Rag" (a) Undistorted; (b) Chopped; (c) Chopped with Feedback Aβ = 0.780; (d) Chopped with Feedback Aβ = 0.200

Figure 33. Average Articulation Scores for Time-Frequency Scanning, Compression Ratio 2:1
MOTIVATION


This  proposal  describes  a  relatively  simple  method  for  compressing 
speech  bandwidth  by  using  two  scanning  filters.  The  ideas  in  this  text 
have  been  generated  from  work  performed  on  a  Filter  Scanning  Speech  Band¬ 
width  Compression  System  sponsored  by  the  Fort  Monmouth  Speech  Processing 
Section on Contract No. DA-36-039-SC-85-140. The preliminary technique had been described to Messrs. M. Weinstock, J. DeClerk and F. Evans of Fort Monmouth, New Jersey, during a visit to PRD Electronics on 25 May 1961.

In a subsequent visit to Bell Telephone Laboratories, Murray Hill, New Jersey, on 13 June 1961, Mr. J. DeClerk of Fort Monmouth and Mr. A. P. Albanese of PRD Electronics presented the proposed ideas to Drs. E. E. David, Jr., J. L. Flannagan, R. Miller, G. Raisbeck and M. R. Schroeder. The purpose of this visit was to ascertain the originality and the validity of the system. The results were favorable: no system of its kind had been fabricated, and the ideas have promise for successful compression of speech bandwidth.

TECHNICAL  DISCUSSION  ON  PROPOSED  SYSTEM 

The proposed system shown in Figure 1 is a simple speech bandwidth compression device capable of sampling, transmitting and restoring those parts of speech which are correlated to the articulatory function of speech. Since word articulation can be attributed to the movements of the bars or formants, the proposed transmitter extracts and transmits this information and the proposed receiver restores the spectrum into intelligible sounds.

The basis for speech bandwidth compression can best be understood by observing a typical sound spectrogram for voiced sounds, shown in Figure 2. Here, the transmitter narrow band pass filters are continuously scanning the formant movements. The resulting signal to be transmitted is composed of pertinent spectrum samples corresponding to the information content of speech. On the average, the transmitted signal will contain both the first and second formant frequencies. However, when third formant energies are dominant, they will occasionally replace the lower intensity second formants. Although the ideal bandwidths for the first two formants are 7.1 and 6.7 cps, respectively, the effectiveness in tracking these formants will dictate the allowable bandwidth of the narrow band pass filters. It should be noted that reception of the first two formants is sufficient to maintain high intelligibility.

The method proposed for sampling the formant movements is to generate oscillator control voltages from the speech spectrum. Figure 3a illustrates a short term power spectrum whose peak values correspond to the bars or formants shown in Figure 2. Integration of the speech signal will result in a slowly varying waveform, which can be viewed in the frequency domain as the low pass filtering shown in Figure 3b. Alternatively, differentiation is analogous to high pass filtering, as seen in Figure 3c. If the integrated and differentiated speech signals are infinitely clipped, the resulting waveforms will be binary arrays of random pulse widths and periods, as shown in Figure 4. Due to the filtering aspects, it is obvious that the number of zero crossings for the differentiated case is greater than that for the integrated case. This is an important observation because the number of zero crossings is a measure of the average frequency of these waveforms. Therefore, the integrated and differentiated cases are respectively correlated to the first and second formant frequencies. Converting the average frequency of these waveforms into voltage amplitude via FM detection will result in amplitude variations corresponding to frequency shifts of the peak energies (see the enclosed Appendix).
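
The zero-crossing argument can be checked on a toy signal. The Python sketch below builds a wave from two steady tones standing in for a first and second formant, integrates and differentiates it numerically, and estimates the average frequency of each result from its sign changes. All numerical values (500 and 2000 cps tones, a relative amplitude of 0.5, 48,000 samples per second) are hypothetical choices for illustration; the running-sum integrator with mean removal and the first-difference differentiator are simple stand-ins for the analog circuits, not the proposed hardware.

```python
import math


def zero_crossing_frequency(samples, duration_sec):
    """Average frequency implied by the sign changes (two per cycle).

    Infinite clipping keeps only the sign of a wave, so counting sign
    changes is the same as counting transitions of the clipped wave.
    """
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (2.0 * duration_sec)


def two_formant_demo(f1=500.0, f2=2000.0, a2=0.5, rate=48000, duration_sec=0.1):
    """Return (integrated, differentiated) zero-crossing frequencies in cps."""
    count = int(round(rate * duration_sec))
    wave = [
        math.cos(2.0 * math.pi * f1 * i / rate)
        + a2 * math.cos(2.0 * math.pi * f2 * i / rate)
        for i in range(count)
    ]
    # Integration as a running sum, with the mean removed afterwards to
    # mimic an AC-coupled integrator.
    running = 0.0
    integrated = []
    for value in wave:
        running += value / rate
        integrated.append(running)
    mean = sum(integrated) / count
    integrated = [value - mean for value in integrated]
    # Differentiation as a first difference.
    differentiated = [(b - a) * rate for a, b in zip(wave, wave[1:])]
    return (zero_crossing_frequency(integrated, duration_sec),
            zero_crossing_frequency(differentiated, duration_sec))


low_estimate, high_estimate = two_formant_demo()
print(low_estimate, high_estimate)
```

With these values the integrated wave yields an estimate close to the lower tone and the differentiated wave an estimate close to the higher tone. Real speech is far less tidy, so the sketch shows only the tendency the text relies on.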

As seen in the block diagram of the proposed system, the outputs of the discriminators are used to control the frequency of the scanning oscillators. Although modulating the speech with the scanning oscillator yields two sideband frequencies with suppressed carrier, the narrow band pass filters transmit only the difference frequencies, resulting in single sideband transmission. This condition can be depicted as though the filters were scanning the speech spectrum as shown in Figure 2. The transmission signal can be viewed as simply translating the speech spectrum up to the transmission pass band. The control voltages are also modulated or translated to the pass band of the facility.

It is expected that the bandwidth of these control voltages will be small, something in the order of 10 to 20 c.p.s. each.
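
The picture of a fixed filter "scanning" the speech spectrum can be made concrete with a little arithmetic: a speech component at frequency f appears after modulation at the oscillator frequency minus f and at the oscillator frequency plus f, and only the difference product can fall inside the filter. The Python sketch below computes which slice of the speech spectrum is selected for a given oscillator frequency. It assumes the oscillator sits above the filter pass band, and it borrows the 20,000 cps center frequency and 750 cps bandwidth from Table 2 of the main report; the proposal itself does not state these values, and the function name is illustrative.

```python
def scanned_speech_window(oscillator_cps,
                          filter_center_cps=20000.0,
                          filter_bandwidth_cps=750.0):
    """Speech band whose difference product (oscillator minus speech
    frequency) falls inside the fixed filter pass band.

    Returns (low, high) in cps, clamped at zero.
    """
    offset = oscillator_cps - filter_center_cps
    low = offset - filter_bandwidth_cps / 2.0
    high = offset + filter_bandwidth_cps / 2.0
    return max(low, 0.0), max(high, 0.0)


# With the oscillator at 21,000 cps the filter selects speech between
# 625 and 1375 cps; moving the oscillator moves the window with it.
print(scanned_speech_window(21000.0))
print(scanned_speech_window(22000.0))
```

Because the sum products lie at or above the oscillator frequency, they fall outside the pass band in this arrangement, which is why only the difference frequencies are transmitted.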

At  the  receiver,  the  compressed  signal  and  control  voltage  information 
will  be  processed  by  the  inverse  action  of  the  transmitter.  Frequency 
selectivity  by  filtering  is  used  to  separate  the  four  basic  signal  components 
as  shown  in  Figure  1.  The  extracted  control  voltages  are  used  in  conjunc¬ 
tion  with  the  formant  compressed  signals  for  demodulation.  This  process  can 
be  visualized  as  retranslating  the  formant  compressed  signal  frequencies  to 
audible  sounds.  Since  the  audible  sounds  are  the  difference  frequencies,  it 
is  questionable  whether  or  not  low  pass  filtering  is  required  because  the 
sum frequencies will lie outside the audible spectrum. The outputs of both
demodulators are combined to restore the formant spectrum.

PRELIMINARY  RESULTS 

A crude test has been performed on differentiated and infinitely clipped speech. The purpose of this test was to compare the spectrum of undistorted speech to that of processed speech. The schematic diagram of Figure 5 shows a differentiator, amplifier, and limiter. A speech signal applied to the input of this circuit can be viewed at the output as the random series of on-off pulses shown in Figure 4b. Using this circuit in conjunction with a Missilyzer results in the spectrogram shown in Figure 6b. Comparing the two spectrograms of Figure 6, it can be seen that the first formant for processed speech has been attenuated and the second formant emphasized. This agrees quite well with the anticipated system behavior described herein.

PROPOSED  PROGRAM 

A program of research and development is proposed leading to an experimental
model incorporating the speech compression ideas herein disclosed.
Pertinent questions to be answered by such a program are:

1.  FM Detector Time Constant:

Here it is required to choose a suitable time constant so
that the amplitude variations of the control voltages
effectively average the formant movements. It can be
seen that extreme time constant values will produce impulse
and d.c. variations, neither of which is applicable. Since
there is an optimum detector time constant for both the
differentiated and the integrated case, it will be our objective to
select both time constants for optimum system performance.

2.  Synchronization:

Due to the continual frequency variation of the scanning
oscillators, the transmitter and receiver must be correctly
synchronized to ensure proper restoration of the speech
spectrum. One difficulty usually encountered with systems
similar to the proposed system is the non-linear phase
characteristic of the narrow band pass filters. The presence
of phase distortion will result in time delay variation as
a function of frequency within the filter pass band. At
the receiver the reconstructed spectrum will be shifted in
frequency by an amount proportional to the frequency location.
The resulting word articulation scores will be adversely
affected due to poor spectrum restoration. It will
be our objective to rectify this matter by accounting for
distortions present in the filter response by phase compensation
and/or, if possible, obtaining filters with correct
phase response.

3.  Compression Ratio:

It is proposed that the bandwidth of the narrow band filters
be approximately 200 c.p.s. each. The reason for this
choice is to determine the effectiveness of this degree of
bandwidth compression. The compressed signal pass band is
expected to be 200 + 200 = 400 c.p.s., and the control voltages
will have a combined bandwidth of 40 c.p.s. Therefore,
for a total transmission bandwidth of 440 c.p.s., the expected
degree of compression is almost [ratio illegible in source].
Results obtained herein will determine whether or not further
compression ratios are advisable.
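
The bandwidth arithmetic in item 3 can be restated as a short calculation. This is a sketch only: the filter and control voltage bandwidths come from the text, but the uncompressed speech bandwidth of 3,000 c.p.s. is an assumption made for illustration and is not a figure given in this report, so the resulting ratio should not be read as the report's own value.

```python
def transmission_bandwidth(filter_bw_hz, n_filters, control_bw_hz, n_controls):
    """Total pass band: compressed formant signals plus control voltages."""
    return filter_bw_hz * n_filters + control_bw_hz * n_controls

def compression_ratio(original_bw_hz, transmitted_bw_hz):
    """Ratio of uncompressed to transmitted bandwidth."""
    return original_bw_hz / transmitted_bw_hz

# From the text: two 200 c.p.s. filters and two 20 c.p.s. control voltages.
total_bw_hz = transmission_bandwidth(200, 2, 20, 2)

# ASSUMPTION: uncompressed speech bandwidth, chosen only for illustration.
ASSUMED_SPEECH_BW_HZ = 3000.0
ratio = compression_ratio(ASSUMED_SPEECH_BW_HZ, total_bw_hz)
```

Changing `ASSUMED_SPEECH_BW_HZ` shows how strongly the quoted degree of compression depends on what is taken as the original speech band.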

It should be appreciated that the foregoing questions can only be
answered by experimental tests, not a priori by analysis, since
the ear-brain function is largely unknown and the properties of speech are
unpredictable.

A program of about one year with two to three engineers and technical
shop services is considered sufficient to arrive at an experimental model
and to provide results of the expected degree of compression. Since
PRD Electronics has been actively engaged in speech communication systems,
such as speech bandwidth compression, voice privacy, and problems related
to the transmission of binary speech codes, our facility in these areas
has continually expanded. Many technical personnel have been trained in
basic communication problems, so that their services can be applied to
systems similar to the proposed device.

APPENDIX


It has been shown by Licklider (2) that integrated and differentiated
infinitely clipped speech articulation scores can be related to zero crossing
information. Observing a short term power spectrum for a typical vowel
sound shown in Figure 3a, we can resolve the effects of differentiation and
integration by viewing them as high and low pass filtering respectively
(Figures 3b and 3c). Obviously formant separation can be obtained
if sharp filters are used. The infinite clipping of both signals
then resolves speech information into the location of the zero crossings.

Since infinitely clipped speech signals are a series of square waves of
uniform amplitude, the information must lie in the average frequency variation
or zero crossings. If one considers an FM signal at the output of a
limiter, the waveform also constitutes a series of square waves which have
frequency variation. Upon injecting this signal into an ideal discriminator,
the output voltage will be directly proportional to the average instantaneous
frequency. Here, we can deduce that the average frequency of differentiated
and infinitely clipped speech will be greater than that of integrated
and infinitely clipped speech because of the filtering aspects. It is this
condition that correlates the integrated and differentiated waveforms to
the first and second formants respectively.
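
The claim that differentiation raises, and integration lowers, the average zero crossing rate can be checked on a toy signal. The sketch below is illustrative only: the two-tone test signal (500 and 1500 c.p.s., standing in for two formants), the sample rate, and all function names are assumptions, and a first difference and a mean-removed running sum are used as crude stand-ins for the analog differentiator and integrator.

```python
import math

SAMPLE_RATE = 48_000               # samples per second (assumed)
N_SAMPLES = 4_800                  # 0.1 second of signal (assumed)
LOW_HZ, HIGH_HZ = 500.0, 1500.0    # stand-ins for two formants (assumed)

def two_tone(n):
    """Strong low tone plus a weaker high tone."""
    t = n / SAMPLE_RATE
    return (math.sin(2 * math.pi * LOW_HZ * t)
            + 0.5 * math.sin(2 * math.pi * HIGH_HZ * t))

def differentiate(x):
    """First difference, a crude stand-in for an analog differentiator."""
    return [b - a for a, b in zip(x, x[1:])]

def integrate(x):
    """Running sum with the mean removed (an a.c.-coupled integrator)."""
    total, out = 0.0, []
    for v in x:
        total += v
        out.append(total)
    mean = sum(out) / len(out)
    return [v - mean for v in out]

def count_zero_crossings(x):
    """Sign changes, i.e. the transitions of the infinitely clipped wave."""
    signs = [v >= 0.0 for v in x]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

samples = [two_tone(n) for n in range(N_SAMPLES)]
crossings_differentiated = count_zero_crossings(differentiate(samples))
crossings_integrated = count_zero_crossings(integrate(samples))
```

For this test signal the integrated waveform crosses zero at about twice the low tone frequency and the differentiated waveform at about twice the high tone frequency, which mirrors the correlation with the first and second formants described above.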

Consider the ideal case of an FM signal shown in Figure 7a, and assume
the modulating signal is a sine wave. The mathematical description of this
wave is well known as (3)

$$v(t) = A \cos(\omega_0 t + \beta \sin \omega_m t) = A \cos \theta(t) \qquad (1)$$

where

$A$ is a constant equal to the amplitude of the waveform,

$\omega_0$ is the carrier frequency in radians/sec.,

$\beta$ is the modulation index, i.e. the ratio of frequency variation
to the modulation frequency, and

$\omega_m$ is the modulation frequency.

The characteristics of an ideal discriminator will provide an output
voltage proportional to $d\theta/dt$, namely,

$$v_{out}(t) = \omega_0 + \Delta\omega \cos \omega_m t \ \text{volts} \qquad (2)$$
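
Equation (2) follows from differentiating the phase of equation (1). The short derivation below writes the modulation index as $\beta$ and the peak frequency variation as $\Delta\omega$; these symbol choices are assumptions made for readability, since the original symbols are not legible in every place.

```latex
\begin{align*}
\theta(t) &= \omega_0 t + \beta \sin \omega_m t \\
\frac{d\theta}{dt} &= \omega_0 + \beta\,\omega_m \cos \omega_m t \\
                   &= \omega_0 + \Delta\omega \cos \omega_m t,
\qquad \Delta\omega = \beta\,\omega_m
\end{align*}
```

The last line restates the definition of the modulation index as the ratio of frequency variation to modulation frequency.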

The FM detector output (omitting the d.c. term, $\omega_0$) will be a voltage
signal proportional to the frequency variation of the carrier. If the signal is
limited so that it appears as shown in Figure 7b, then an alternate solution
for its behavior must be used. The technique introduced by Stumpers (4),
which makes use of the average zero crossings of the wave, is applicable to
both waves shown in Figures 7a and 7b and is also valid for waveforms which
are perturbed by external sources, e.g. noise and speech variation. Stumpers
defines the instantaneous frequency as

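$$f_i = \frac{1}{2T} \times (\text{no. of zero crossings in } T \text{ sec.}) = \frac{1}{T} \times (\text{no. of positive zero crossings in } T \text{ sec.})$$

where $T$ is chosen in the interval

$$\frac{1}{f_0} \le T \le \frac{1}{B}$$

and $B$ is the signal bandwidth.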
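
The zero crossing definition of frequency can be tried numerically on the sinusoidally modulated wave of equation (1). The following is a minimal sketch under stated assumptions: the carrier, modulation frequency, frequency variation, window length, and sampling step are all hypothetical values, and the function names are invented for illustration. The estimate counts sign changes in a window of length T and divides by 2T; because the count is an integer, it can differ from the mean frequency over the window by up to 1/(2T).

```python
import math

CARRIER_HZ = 2000.0            # f_0 (assumed)
MOD_HZ = 10.0                  # modulation frequency (assumed)
DEVIATION_HZ = 200.0           # peak frequency variation (assumed)
BETA = DEVIATION_HZ / MOD_HZ   # modulation index

def phase(t):
    """theta(t) of equation (1), in radians."""
    return (2 * math.pi * CARRIER_HZ * t
            + BETA * math.sin(2 * math.pi * MOD_HZ * t))

def zero_crossing_frequency(t_start, window, step=1e-6):
    """(no. of zero crossings in the window) / (2 * window), in c.p.s."""
    n_steps = int(round(window / step))
    crossings = 0
    previous = math.cos(phase(t_start)) >= 0.0
    for k in range(1, n_steps + 1):
        current = math.cos(phase(t_start + k * step)) >= 0.0
        if current != previous:
            crossings += 1
        previous = current
    return crossings / (2.0 * window)

def mean_frequency(t_start, window):
    """Average of the instantaneous frequency over the window, in c.p.s."""
    return (phase(t_start + window) - phase(t_start)) / (2 * math.pi * window)

WINDOW = 0.01   # T, between 1/f_0 = 0.0005 s and the modulation period
estimate = zero_crossing_frequency(0.0, WINDOW)
reference = mean_frequency(0.0, WINDOW)
```

A longer window reduces the counting error but averages over more of the modulation, which is the trade-off expressed by the bounds on T.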

The process of FM detection can be visualized by considering the counting
function of Figure 8. For each successive zero crossing, the counting
function increases one unit. The pattern displayed can be found by evaluating
the zero crossings of equation (1), viz.

$$\omega_0 t + \beta \sin \omega_m t = (2n-1)\frac{\pi}{2} \qquad (3)$$

which can be solved graphically as shown in Figure 9. The time at which a
zero will occur is determined by the intersection of $\beta \sin \omega_m t$ and the
parametric $(2n-1)\frac{\pi}{2} - \omega_0 t$ equations. An alternate solution for the counting
function could have been obtained from equation (3) if we plot the left
side as a continuous function and then choose discrete unit values along the
ordinate. From these figures, the output voltage of the FM detector is

•  Zk  ^ 

yielding  the  required  average  frequency  of  the  FM  waveforms. 

Extending this idea to differentiated and/or integrated and infinitely
clipped speech, the discriminator process can be exploited by observing
Figure 10. Assume a typical speech pattern shown in Figure 10a. It is desired
to obtain a voltage amplitude measure of the average frequency of this waveform
as a function of time. FM detection will provide output voltage
proportional to the average frequency within a time interval T. For example,
the process the infinitely clipped signal follows is differentiation, as
shown in Figure 10b, and detection, shown in Figure 10c. The detector low
pass filter output voltage will be proportional to the number of impulse
functions within the time constant interval, as can be seen in Figure 10d.
Therefore, the output waveform is a slowly varying signal, correlated to the
average frequency of the speech zero crossings. This signal will be applied
to the scanning oscillator of the system to scan the corresponding speech
formants as shown in Figure 1.
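
The detection chain just described (an impulse at each zero crossing followed by a low pass filter) can be sketched in a few lines. Everything numerical here is an assumption made for illustration: the sample rate, the 0.01 second time constant, the clipped test tone that steps from 500 to 1500 c.p.s., and the single-pole filter used as the detector low pass filter. The sketch is not the circuit of this report, only one simple instance of the idea.

```python
import math

SAMPLE_RATE = 48_000                  # samples per second (assumed)
TIME_CONSTANT = 0.01                  # detector time constant, seconds (assumed)
SEGMENT_SAMPLES = SAMPLE_RATE // 5    # 0.2 s at each test frequency (assumed)
TAIL_SAMPLES = SAMPLE_RATE // 50      # 0.02 s averaging window (assumed)

def clipped_test_signal(n):
    """Infinitely clipped tone: 500 c.p.s. first, then 1500 c.p.s."""
    t = n / SAMPLE_RATE
    freq = 500.0 if n < SEGMENT_SAMPLES else 1500.0
    # The 0.1 rad offset keeps zero crossings away from the sample instants.
    return 1.0 if math.sin(2 * math.pi * freq * t + 0.1) >= 0.0 else -1.0

def detect(n_samples):
    """Unit-area impulse per zero crossing, smoothed by a one-pole low pass."""
    dt = 1.0 / SAMPLE_RATE
    alpha = dt / TIME_CONSTANT
    y, out = 0.0, []
    previous = clipped_test_signal(0)
    for n in range(1, n_samples):
        current = clipped_test_signal(n)
        impulse = 1.0 / dt if current != previous else 0.0
        y += alpha * (impulse - y)    # y approaches crossings per second
        out.append(y)
        previous = current
    return out

def mean(values):
    return sum(values) / len(values)

output = detect(2 * SEGMENT_SAMPLES)
end_of_first = SEGMENT_SAMPLES - 1
low_level = mean(output[end_of_first - TAIL_SAMPLES:end_of_first])
high_level = mean(output[-TAIL_SAMPLES:])
```

With these assumed values the smoothed output settles near 1,000 during the first tone and near 3,000 during the second, i.e. it tracks the zero crossing rate. A much shorter time constant would leave the individual impulses visible and a much longer one would flatten the output toward d.c., which is the trade-off raised under the FM detector time constant question.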




REFERENCES 


(1)  Flanagan, J. L.: "Bandwidth and Channel Capacity Necessary
to Transmit the Formant Information of Speech", JASA,
Vol. 28, No. 4, July 1956, pp. 592 - 596.


(2)  Licklider, J. C. R. and Pollack, I.: "Effects of Differentiation,
Integration and Infinite Clipping upon the Intelligibility
of Speech", JASA, 1948, Vol. 20, p. 42.


(3)  Black, H. S.: "Modulation Theory", D. Van Nostrand Co., Inc.,
1953.


(4)  Stumpers, F. L. H. M.: "Theory of Frequency Modulation
Noise", Proc. IRE, Sept. 1948, pp. 1081 - 1092.




Figure 1  BLOCK DIAGRAM OF PROPOSED
SPEECH BANDWIDTH COMPRESSION SYSTEM

[Block diagram not reproduced. Transmitter blocks: integrator and infinite
clipper, differentiator and infinite clipper, discriminators, scanning
oscillators, balanced modulators, narrow band pass filters centered at f1A
and f2A, local oscillators, band pass filter and sync compensator.
Receiver blocks, following the transmission link: band pass filters for the
formant control voltages and compressed signals, balanced demodulators,
scanning oscillators, local oscillators, speaker.]


(a)  Sound Spectrogram of Voiced Speech Signal


(b)  Narrow Band Pass Filter Scanning

Figure 7b  INFINITELY CLIPPED FM WAVEFORM


Figure 8  COUNTING FUNCTION vs TIME

Figure 10  WAVEFORMS OF PROCESSED SPEECH